Add video_contact_sheet: Generate contact sheets from videos using keyframes #426

libran11 · 2025-05-13T06:15:58Z

🚀 Add video_contact_sheet: Generate contact sheets from videos using keyframes

Summary

This PR introduces a new advanced utility script video_contact_sheet which generates visual contact sheets from video files using scene-change keyframes.

Description

This tool extracts keyframes based on histogram difference, composes them into a grid with metadata (duration, resolution, codec), and exports them as JPEG images. It is particularly useful for dataset QA, content review, or automated preview generation.
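As a rough illustration of histogram-difference keyframe selection, the sketch below works on in-memory grayscale frames. It is not the code in core.py: the function names, the 32-bin histogram, and the 0.3 threshold are assumptions chosen for the example.

```python
from typing import List

import numpy as np


def gray_histogram(frame: np.ndarray, bins: int = 32) -> np.ndarray:
    """Return a normalized intensity histogram for a uint8 frame."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)


def select_keyframes(frames: List[np.ndarray], threshold: float = 0.3) -> List[int]:
    """Return indices of frames that differ enough from the last kept keyframe.

    The difference is half the L1 distance between normalized histograms,
    so it ranges from 0.0 (identical) to 1.0 (no overlap). The threshold is
    an assumed value, not one taken from the PR.
    """
    if not frames:
        return []
    keep = [0]  # always keep the first frame
    ref = gray_histogram(frames[0])
    for i, frame in enumerate(frames[1:], start=1):
        hist = gray_histogram(frame)
        diff = 0.5 * float(np.abs(hist - ref).sum())
        if diff > threshold:
            keep.append(i)
            ref = hist  # compare later frames against the new keyframe
    return keep


# Synthetic "video": three dark frames followed by three bright frames.
dark = np.full((4, 4), 10, dtype=np.uint8)
bright = np.full((4, 4), 240, dtype=np.uint8)
keyframes = select_keyframes([dark, dark, dark, bright, bright, bright])
```

Comparing against the last kept keyframe rather than the immediately preceding frame avoids missing slow fades, at the cost of occasionally keeping an extra frame.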

The script supports both single-file and batch directory input, and includes multi-threaded processing for high efficiency. Full CLI interface and pytest-based test cases are included.
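The batch mode could be organized along the following lines. This is a minimal sketch using the standard library thread pool; `process_batch` and `fake_worker` are hypothetical names, and the real per-video job in the PR is replaced by a stand-in that only derives an output filename.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Dict, Iterable


def process_batch(
    videos: Iterable[Path],
    worker: Callable[[Path], Path],
    max_workers: int = 4,
) -> Dict[Path, object]:
    """Run `worker` on each video in a thread pool.

    Returns a mapping from input path to either the worker's result or the
    exception it raised, so one bad file does not abort the whole batch.
    """
    results: Dict[Path, object] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, v): v for v in videos}
        for fut in as_completed(futures):
            video = futures[fut]
            try:
                results[video] = fut.result()
            except Exception as exc:  # record the failure and keep going
                results[video] = exc
    return results


def fake_worker(video: Path) -> Path:
    """Stand-in for the real per-video job (hypothetical)."""
    if video.suffix != ".mp4":
        raise ValueError(f"unsupported file: {video}")
    return video.with_suffix(".jpg")


out = process_batch([Path("a.mp4"), Path("b.txt")], fake_worker)
```

Threads suit this workload when most of the time is spent in native decoding or subprocess calls; for CPU-bound pure-Python work a process pool would be the usual alternative.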

The changes are as follows:

- Added video_contact_sheet/ submodule with the following files:
  - cli.py: CLI entry using click
  - core.py: Core logic for keyframe extraction and contact sheet composition
  - utils.py: FFmpeg metadata fetching and thread pool management
  - tool_test/test_contact_sheet.py: Unit test using a generated sample video
  - requirements.txt: Light dependencies list
  - README.md: Project documentation
- Provided random sample video and automated testing
- Compatible with Pillow ≥10 (textbbox used)
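The grid arithmetic behind contact sheet composition can be sketched without any imaging library. The defaults `cols=5` and `margin=8` mirror the `make_contact_sheet` signature quoted in the review; `grid_layout`, the thumbnail size, and `footer_h` are assumptions for illustration and may not match core.py.

```python
import math
from typing import List, Tuple


def grid_layout(
    n_frames: int,
    thumb_w: int,
    thumb_h: int,
    cols: int = 5,
    margin: int = 8,
    footer_h: int = 0,
) -> Tuple[Tuple[int, int], List[Tuple[int, int]]]:
    """Return the canvas size and the top-left corner of each thumbnail."""
    if n_frames <= 0 or cols <= 0:
        raise ValueError("n_frames and cols must be positive")
    cols = min(cols, n_frames)  # do not reserve empty columns
    rows = math.ceil(n_frames / cols)
    # One margin before every column/row plus one trailing margin.
    width = cols * thumb_w + (cols + 1) * margin
    height = rows * thumb_h + (rows + 1) * margin + footer_h
    positions = []
    for i in range(n_frames):
        row, col = divmod(i, cols)
        x = margin + col * (thumb_w + margin)
        y = margin + row * (thumb_h + margin)
        positions.append((x, y))
    return (width, height), positions


size, positions = grid_layout(7, thumb_w=160, thumb_h=90, footer_h=20)
```

Each position can then be passed to a paste call on the canvas, with the metadata footer drawn in the reserved strip at the bottom.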

Checks

in the repository

•	Made no changes that degrade the functioning of the repository
•	Gave each commit a better title (unlike updated README.md)

in the PR

•	Followed the format of the pull_request_template
•	Made the Pull Request in a small level (for the creator’s welfare)
•	Tested the changes you made

Pull Request Overview

This PR introduces a new video processing utility called video_contact_sheet that generates visual contact sheets (filmstrip grids) from video files using scene-change keyframe detection. The tool is designed for video dataset QA, content review, and automated preview generation.

- Core functionality for keyframe extraction using histogram difference and contact sheet composition
- CLI interface with multi-threading support for batch processing
- Utility functions for FFmpeg metadata extraction and parallel processing

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
video_contact_sheet/utils.py	FFmpeg metadata extraction and thread pool utilities
video_contact_sheet/requirements.txt	Project dependencies specification
video_contact_sheet/core.py	Core keyframe extraction and contact sheet generation logic
video_contact_sheet/cli.py	Command-line interface with click framework
video_contact_sheet/__init__.py	Package initialization and version definition
video_contact_sheet/README.md	Documentation and usage examples

Copilot · 2025-07-21T16:42:50Z

video_contact_sheet/utils.py

+    proc = subprocess.run(
+        FFPROBE_CMD + [str(path)], capture_output=True, text=True, check=True
+    )
+    return json.loads(proc.stdout)


The subprocess.run call lacks proper error handling. When check=True, subprocess.CalledProcessError will be raised but not handled, making it difficult for users to understand what went wrong with ffprobe execution.

Suggested change

-    proc = subprocess.run(
-        FFPROBE_CMD + [str(path)], capture_output=True, text=True, check=True
-    )
-    return json.loads(proc.stdout)
+    try:
+        proc = subprocess.run(
+            FFPROBE_CMD + [str(path)], capture_output=True, text=True, check=True
+        )
+        return json.loads(proc.stdout)
+    except subprocess.CalledProcessError as e:
+        raise RuntimeError(
+            f"ffprobe command failed for path {path}. Error: {e.stderr.strip()}"
+        ) from e
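The suggestion covers a non-zero exit code. Two other failures are likely in practice: the executable is not installed, or its output is not valid JSON. The sketch below handles all three. `run_json_command` is a hypothetical helper, and the demo uses the current Python interpreter as a stand-in for ffprobe because the contents of `FFPROBE_CMD` are not shown in this PR.

```python
import json
import subprocess
import sys
from typing import List


def run_json_command(cmd: List[str]) -> dict:
    """Run a command that prints JSON on stdout and return the parsed result."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    except FileNotFoundError as e:
        # Raised when the executable itself cannot be found on PATH.
        raise RuntimeError(f"executable not found: {cmd[0]}") from e
    except subprocess.CalledProcessError as e:
        stderr = (e.stderr or "").strip()
        raise RuntimeError(
            f"command failed with exit code {e.returncode}: {stderr}"
        ) from e
    try:
        return json.loads(proc.stdout)
    except json.JSONDecodeError as e:
        raise RuntimeError(f"command printed invalid JSON: {e}") from e


# Stand-in command: the Python interpreter prints a small JSON document.
ok = run_json_command([sys.executable, "-c", "print('{\"streams\": []}')"])
```

Wrapping all three cases in one exception type keeps the CLI's error reporting in a single place.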

Copilot · 2025-07-21T16:42:51Z

video_contact_sheet/core.py

@@ -0,0 +1,115 @@
+"""
+Core logic: Keyframe extraction, contact table splicing.


The term 'contact table splicing' should be 'contact sheet composition' to match the actual functionality and be consistent with terminology used elsewhere in the codebase.

Suggested change

-Core logic: Keyframe extraction, contact table splicing.
+Core logic: Keyframe extraction, contact sheet composition.

Copilot · 2025-07-21T16:42:51Z

video_contact_sheet/core.py

+
+def _collect_meta(video_path: Path) -> Dict:
+    info = ffprobe_metadata(video_path)
+    v_stream = next(s for s in info["streams"] if s["codec_type"] == "video")


The next() function will raise StopIteration if no video stream is found. This should use next() with a default value or proper exception handling to avoid crashes when processing files without video streams.

Suggested change

-    v_stream = next(s for s in info["streams"] if s["codec_type"] == "video")
+    v_stream = next((s for s in info["streams"] if s["codec_type"] == "video"), None)
+    if v_stream is None:
+        raise ValueError(f"No video stream found in file: {video_path}")
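The difference between the two forms of `next()` can be checked in isolation. In the sketch below, `first_video_stream` is a hypothetical helper and the dictionaries are hand-written stand-ins for parsed ffprobe output, not real probe results.

```python
from typing import Any, Dict, Optional


def first_video_stream(info: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the first stream whose codec_type is "video", or None."""
    streams = info.get("streams", [])
    # The second argument to next() is returned instead of raising StopIteration.
    return next((s for s in streams if s.get("codec_type") == "video"), None)


audio_only = {"streams": [{"codec_type": "audio", "codec_name": "aac"}]}
with_video = {
    "streams": [
        {"codec_type": "audio", "codec_name": "aac"},
        {"codec_type": "video", "codec_name": "h264"},
    ]
}

missing = first_video_stream(audio_only)
found = first_video_stream(with_video)
```

Using `.get()` also tolerates a stream entry without a `codec_type` key, which the original indexing would turn into a `KeyError`.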

Copilot · 2025-07-21T16:42:51Z

video_contact_sheet/README.md

+Scene-change detection for “interesting” keyframes
+Multithreaded extraction using OpenCV + ffmpeg
+Footer shows duration / resolution / codec
+Pure-Python, works on Windows/Linux/macOS


The markdown formatting is missing proper bullet points. Lines 15-18 should be formatted as a proper markdown list with '- ' prefix for each feature.

Suggested change

-Scene-change detection for “interesting” keyframes
-Multithreaded extraction using OpenCV + ffmpeg
-Footer shows duration / resolution / codec
-Pure-Python, works on Windows/Linux/macOS
+- Scene-change detection for “interesting” keyframes
+- Multithreaded extraction using OpenCV + ffmpeg
+- Footer shows duration / resolution / codec
+- Pure-Python, works on Windows/Linux/macOS

Copilot · 2025-07-21T16:42:52Z

video_contact_sheet/core.py

+FONT = ImageFont.load_default()
+
+
+def make_contact_sheet(
+    frames: List[np.ndarray],
+    metadata: Dict,
+    cols: int = 5,
+    margin: int = 8,
+) -> Image.Image:
+


[nitpick] The global font variable should be initialized inside the function or made configurable. Global state can cause issues in multi-threaded environments and makes the code less flexible.

Suggested change

-FONT = ImageFont.load_default()
-
-
-def make_contact_sheet(
-    frames: List[np.ndarray],
-    metadata: Dict,
-    cols: int = 5,
-    margin: int = 8,
-) -> Image.Image:
+def make_contact_sheet(
+    frames: List[np.ndarray],
+    metadata: Dict,
+    cols: int = 5,
+    margin: int = 8,
+    font: ImageFont.ImageFont = None,
+) -> Image.Image:
+    if font is None:
+        font = ImageFont.load_default()

Copilot · 2025-07-21T16:42:52Z

video_contact_sheet/cli.py

+    vids = []
+    for p in inputs:
+        if p.is_dir():
+            vids.extend(list(p.rglob("*.mp4")))
+        else:
+            vids.append(p)


The video discovery is hardcoded to only find .mp4 files. Consider supporting multiple video formats or making the file extensions configurable to handle common video formats like .avi, .mov, .mkv, etc.

Suggested change

-    vids = []
-    for p in inputs:
-        if p.is_dir():
-            vids.extend(list(p.rglob("*.mp4")))
-        else:
-            vids.append(p)
+    # Define supported video extensions
+    supported_extensions = [".mp4", ".avi", ".mov", ".mkv"]
+    vids = []
+    for p in inputs:
+        if p.is_dir():
+            for ext in supported_extensions:
+                vids.extend(list(p.rglob(f"*{ext}")))
+        else:
+            if p.suffix in supported_extensions:
+                vids.append(p)
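Two details of the suggestion may be worth a second look: the suffix match is case-sensitive, so `clip.MOV` would be skipped, and explicitly named files with an unlisted suffix are now silently dropped, which the original code did not do. A possible variant is sketched below, assuming the same four extensions; `discover_videos` and `VIDEO_EXTENSIONS` are hypothetical names, not part of cli.py.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List

VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv"}  # assumed set; adjust as needed


def discover_videos(inputs: Iterable[Path]) -> List[Path]:
    """Expand directories into video files; pass explicit files through unchanged."""
    vids: List[Path] = []
    for p in inputs:
        if p.is_dir():
            # Single directory walk with a case-insensitive suffix check.
            vids.extend(
                sorted(
                    f
                    for f in p.rglob("*")
                    if f.is_file() and f.suffix.lower() in VIDEO_EXTENSIONS
                )
            )
        else:
            # Trust files the user named explicitly, whatever their suffix.
            vids.append(p)
    return vids


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("a.mp4", "b.MOV", "notes.txt"):
        (root / name).touch()
    found = [f.name for f in discover_videos([root])]
```

Walking the tree once also avoids repeating the recursive scan for every extension.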

libran11 and others added 5 commits May 13, 2025 13:16

- init video contact sheet (7dbc369)
- utils compelete (bb76446)
- core function (af53649)
- cli tools (4b03164)
- core func and fix bugs (f467aa3)

DhanushNehru requested a review from Copilot July 21, 2025 16:41

Copilot AI reviewed Jul 21, 2025
